arXiv:1502.00799v2 [nlin.CD] 14 Mar 2015 


An early warning indicator for atmospheric blocking events using 

transfer operators 


Alexis Tantet, Fiona R. van der Bnrgt, and Henk A. Dijkstra 
Institute for Marine and Atmospheric research Utrecht, 

Department of Physics and Astronomy, 

Utrecht University, Utrecht, The Netherlands 

Abstract 

The existence of persistent midlatitude atmospheric flow regimes with time-scales larger than 5- 
10 days and indications of preferred transitions between them motivates to develop early warning 
indicators for such regime transitions. In this paper, we use a hemispheric barotropic model 
together with estimates of transfer operators on a reduced phase space to develop an early warning 
indicator of the zonal to blocked flow transition in this model. It is shown that, the spectrum of the 
transfer operators can be used to study the slow dynamics of the flow as well as the non-Markovian 
character of the reduction. The slowest motions are thereby found to have time scales of three 
to six weeks and to be associated with meta-stable regimes (and their transitions) which can be 
detected as almost-invariant sets of the transfer operator. From the energy budget of the model, 
we are able to explain the meta-stability of the regimes and the existence of preferred transition 
paths. Even though the model is highly simplihed, the skill of the early warning indicator is 
promising, suggesting that the transfer operator approach can be used in parallel to an operational 
deterministic model for stochastic prediction or to assess forecast uncertainty. 
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The transfer operator of a model of the atmosphere is used to isolate and 
detect persistent atmospheric regimes and to dehne an early warning indicator 
of transitions between these regimes. The methodology is expected to have 
application potential to general dissipative chaotic systems. 


I. INTRODUCTION 

The midlatitude atmospheric flow is considered to be a chaotic dynamical system for 
which predictability is limited DEI- Although the behavior of this flow is dominated by 
weather systems on short time scales caused by baroclinic instability, strong variability on 
time scales longer than 5-10 days, with a predominantly barotropic structure, is also observed 
[3]. It has been argued that at least part of the observed low-frequency variability can be 
explained by recurrent and persistent atmospheric regimes nn such as the North Atlantic 
Oscillation (NAO) and blocking events [HIH]. 

Many studies identifying atmospheric regimes use algorithms relying on the recurrence 
property of these regimes such as the k-means in ^ and the Gaussian mixture algorithms P 
[TO]. Other studies make use of persistence properties, for example leading to Hidden Markov 
Models nnini. Most of these techniques rely on the reduction of the high-dimensional phase 
space to a few dimensions. 

The existence of weather regimes in General Girculation Models (GGM) and in reanalysis 
has been questioned for some time unHis]. Using the Integrated Forecast System (IFS) of 
the European Gentre for Medium-Range Weather Forecasts (talk), it was shown [THHTS] 
that it was necessary to use a spatial resolution of T1279 (16 km), or to include stochastic 
parametrizations, in order for the atmospheric regime behavior to occur. This suggests that 
although the atmospheric regimes are large-scale low frequency motions, the faster small- 
scale motions (either explicitly resolved or included as random perturbations) are important 
to simulate them. 

The barotropic structure of midlatitude low-frequency variability has motivated early 
studies using low-order barotropic models. Gharney and DeVore [I9] have shown that such 
regimes could manifest themselves in highly truncated spectral barotropic models as stable 
hxed points representative of different solutions of a standing Rossby wave over topography. 
Flow regimes and spontaneous transitions have been observed in laboratory experiments 
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using rotating annulus experiments for a barotropic fluid with topography [20] and for a 
two-layer shear flow In [20], a zonal flow and blocked flow were found for different 

values of the Rossby number, and spontaneous transitions between the two were observed 
for intermediate values of the Rossby number. They suggested that these transitions were 
associated with the existence of two basins of attraction connected by heteroclinic orbits. 

A scenario of chaotic itinerancy [25l[26] permitted by heteroclinic connections is supported 
by the study of Crommelin et al. [2Z| using a 6-mode barotropic model. For specihc values 
of the forcing parameter, the two stable hxed points of the zonal and blocked regimes merge 
with a periodic orbit (due to barotropic instability), yielding a heteroclinic connection. 
Although such a specihc situation is unlikely to exist in the real atmosphere, Crommelin 
[28] found evidence of ruins of such a heteroclinic connection in a hemispheric barotropic 
model with realistic topography and forcing [29], manifested by the presence of preferred 
transition paths. Regime behavior was also found in more realistic barotropic [2RII2UII2T] and 
multilayer quasi-geostrophic models un Because these models exhibit chaotic behavior, 
the regimes are no longer identihed by stable hxed points but rather as neighborhoods in the 
phase space where trajectories tend to persist, motivating their denomination as meta-stable 
regimes. 

In the laboratory experiments by Williams [2T], it is the inertia-gravity waves which are 
responsible for the regime transitions when the how is baroclinically unstable. Such waves 
and barotropic disturbances occur in the real atmosphere together with baroclinically unsta¬ 
ble synoptic weather systems, so that it is not yet clear if one disturbance is more important 
than the other in inducing certain regime transitions. All these possible mechanisms suggest, 
however, that the variability of the midlatitude atmospheric circulation can be captured by 
a deterministic model with multiple basins of attraction ‘forced’ by random perturbations 
representative of high-frequency eddies. 

This multi-scale property of the climate system motivates a stochastic-modeling approach 
[22] to climate variability [22]. Stochastic climate modeling often relies on a time-scale 
separation where the state vector is decomposed into a slow climate component and fast 
weather fluctuations. These fast fluctuations are not resolved explicitly but their aggregated 
effect is represented by a noise term. The system is thus modeled by a Stochastic Differential 
Equation (SDE) [32] with in general non-linear deterministic terms and additive and/or 
state-dependent noise terms [2nj27] . When the time-scale separation assumption is violated. 


3 


the Mori-Zwanzig formalism shows that a non-Markovian term representative of the memory 
effect induced by past interactions between the resolved and the unresolved variables has to 
be added [55VHT] . Stochastic modeling has been applied to many problems in climate science, 
such as subgrid-scale parametrization, uncertainty quantihcation and data assimilation m 

112 ]. 

When randomness is present in a dynamical system, whether it is because of uncertainty 
in the initial state of chaotic systems or because of a stochastic forcing, it is of interest to 
study the evolution of probability densities in phase space by the flow rather than that of 
individual trajectories [mill]. This evolution is given by the transfer operator whose point 
spectrum, the Ruelle-Policott resonances [I5ljl9] - valuable information on the slow 
dynamics of the system. For mixing dissipative systems, these resonances are associated 
with a slow correlation decay and the manifestation of meta-stability |44j . 

The main purpose of the present study is to develop an early warning indicator of transi¬ 
tions between atmospheric flow regimes. A traditional method that gives an early warning 
of a sudden transition is the use of the critical slow down of the system when it gets close 
to a bifurcation point [50]. However, in a high-dimensional model such as that used in 
[2H] it is too simplistic to reduce the topology of the system to one or more stable hxed 
points. Recently, it was shown that complex networks can reveal information on nearby 
simple bifurcations in high-dimensional dynamical systems [51] • Near bifurcation points, 
the topology of the network changes drastically and early warning indicators for transitions 
were developed based on these topological changes [521154] . 

In this study, we base the early warning indicator on the evolution of probability densities 
with respect to meta-stable regimes in a reduced phase space of the barotropic model used 
in Crommelin [28]. To study the slow dynamics in this phase space and to evaluate the 
effect of memory induced by the reduction, the spectrum of transfer operators estimated for 
different lags is analyzed. Meta-stable regimes are subsequently detected from the transfer 
operator at a carefully chosen lag. An early warning indicator of transition to the blocked 
regime is developed from the transfer operator making use of the existence of preferred 
transition paths between the regimes. To test the quality of the early warning indicator, 
a traditional method employing the Peirce skill score [55l - [57j is used. Finally, a study of 
the energy budget of the barotropic model is performed, where particular attention is given 
to the conversion of mean kinetic energy to eddy kinetic energy by Reynolds’ stresses, to 
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provide a physical background to the early warning indicator. 


II. REDUCTION OF THE T21-BAROTROPIC MODEL 

A. Model and Data 

Transitions between zonal and blocked regimes of the northern hemisphere atmospheric 
circulation are here investigated using a barotropic model |2Hl EH], |36] . The dimensionless 
equation of the model, expressed in terms of the streamfunction -0 (representing the non- 
divergent flow) and using the mean radius of the Earth and the inverse of its rotation rate 
as horizontal and temporal scale, is given by the barotropic vorticity equation (BVE) 

+ / + h,) - ( 1 ) 

at 

where J' denotes the Jacobian operator, / the Coriolis parameter, hb the scaled orography, 
ki the Ekman damping coefficient, k 2 the coefficient of scale-selective damping and V^'0* 
the prescribed vorticity forcing. The non-dimensional orography hb is related to the one of 
the real Northern Hemisphere h[ by 

hfe = 2Ao^ sin0o, (2) 

where 0o = 45°N, Aq = 0.2 is a factor determining the strength of the surface winds that 
blow across the topography, and is a scale height of 10 km. 

The BVE is projected onto spherical harmonics, triangularly truncated at the 21®* mode 
(T21). The spherical harmonic coefficients are chosen such that the model is hemispheric 
with no flow across the equator, resulting in a system of 231 Ordinary Differential Equations 
(ODE), which are integrated using a fourth order Runge-Kutta numerical scheme. Following 
[29] . the Ekman damping time scale and the scale-selective damping time scale were chosen 
as 15 days and 3 days (for wavenumber 21), respectively, so as to adequately reproduce 
the observed mean and variance of the 500hPa Northern-Hemisphere 10-day mean relative 
vorticity. The vorticity forcing is calculated from ECMWF reanalysis data of winter¬ 

time 500hPa relative vorticity from 1981 to 1991, in order for the hrst two moments of the 
simulated relative vorticity to be as close as possible to the observed relative vorticity. The 
term is calculated [29] |58] according to 

+ f + hb) + (3) 
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where V’d is the mean of the observed streamfunction. The quantity is the deviation of 
the 10-day running mean observed streamfunction from ipci- The present study relies on a 
500,000-day-long simulation using an integration time-step of 30 minutes, with daily output 
and a spin-up of 5,000 days removed. 


B. Phase-space reduction 

In order to investigate the presence of meta-stable regimes and preferred transition paths, 
it is important to define a proper reduction of the 231-dimensional phase space. This is 
usually done by projecting the state vector on a low-dimensional basis of orthogonal vectors 
such that the variance of the projected state vector is maximized. In practice, this can be 
achieved by an Empirical Orthogonal Function [EOF, e.g [59] analysis of the streamfunction 
normalized by the kinetic energy norm [2H1 ES] • The streamfunction patterns of the three 
leading EOFs are represented in £gure[T]and explain 36.7% of the total variance. The hrst 
EOF is related to the Arctic Oscillation (AO), blocking events and the strength of the polar 
vortex [28|. As it shows a dipole-like pattern over the Atlantic basin, the third EOF has 
been associated with the North Atlantic Oscillation (NAO). 

For the purpose of this study, special care must be given to the choice of the EOFs 
used to dehne the reduce phase space. Indeed, the presence of meta-stable regimes and 
the transitions between them should not be hidden by the projection. Furthermore, for 
the sake of time-scale separation between the motions in the reduced phase space and the 
unresolved ones, it is important for the principal components of the selected EOFs to show 
decorrelation times as large as possible compared to the other principal components. These 
decorrelation times, dehned as the time scale after which the autocorrelation function has 
decayed to 1/e of its value at lag 0, are plotted in figure for the 20 leading principal 
components. Largest decorrelation times are found for principal components 1, 2 and 3 
(41, 18 and 15 days, respectively), while other principal components have a decorrelation 
time shorter than 10 days (with many smaller than 5 days). Because Crommelin |2B] has 
shown that meta-stability was mainly visible on the first principal component (pc^), while 
transition between the meta-stable regimes occurred through low and high values of principal 
component pcg, we have chosen to define the reduced phase space Y as the (EOFi,EOF 3 ) 
plane. The projection of the state vector x G X on the reduced phase space Y is then given 
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FIG. 1; Dimensionless streamfunction of the three leading EOFs of the barotropic model 
EOF 1 (a), EOF 2 (b) and EOF 3 (c) that explain 21.4%, 8.5% and 6.8% of the variance 

resp^tively. 
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FIG. 2: Decorrelation times (in days) of the 20 leading principal components. 


by the observable y obtained by h : i—)■ = (pc^(t), pc 3 (t)). 

The two-dimensional normalized histogram of pcj^ and PC 3 for the 500,000-day-long sim- 
nlation is (cf. hgnre similar to figure 7 in [28]. To estimate the histogram, principal 
components pc^ and PC 3 were normalized by their respective standard deviations and the 
(EOFi, EOF 3 ) plane was discretized into a grid G of 50 x 50 boxes spanning a [—3, 3] x [—3, 3] 
square. Grid boxes at the boundary are extended so that all realizations belong to the grid. 
Furthermore, if a box contains no realization, it is removed from the grid, as it is not likely 
to support part of the projection of the attractor. This resulted in a grid of m = 1577 boxes 
containing on average 314 realizations and such that 80% of the boxes contain at least 20 
realizations. The components Hi of the normalized histogram H for grid-box Bi are then 
calculated as the likelihoods P(|/t G Bi) = ^{yt G Bi\/jj^{yt G G}, where G Bi} is the 
number of realizations of the observable y in box Bi and ij^{yt G G} is the total number of 
realizations. 

The histogram in figure shows two local maxima indicative of the presence of recurrent 


or persistent regimes. These regimes will be precisely dehned in section IV G For now, 
following Grommelin [2S|, we associate the local maximum for negative pci to the blocked 
regime (since it corresponds to high pressure over north-western Europe) and the maximum 
for positive values of pcj^ to the zonal regime. Grommelin [28] also showed that the transitions 
from the zonal to the blocked regime were preferably going through negative values of PC 3 , 
interpreted as the positive phase of the NAO. 
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FIG. 3: Normalized histogram of the first and third leading principal components 
normalized by their respective standard deviations and discretized nsing 50 x 50 grid boxes. 


The dynamics of the state vector x of the barotropic model on the phase space X are 
deterministic and Markovian. However, application of the Mori-Zwanzig formalism [3HI EHl 
E] indicates that in a closed model of the dynamics of the observable y, memory terms 
acconnting for past interactions between the resolved and the nnresolved variables have to 
be inclnded together with an, in general non-white, noise term. Under the assnmption that 
the time scales of the resolved variables are slower by several orders of magnitnde than those 
of the nnresolved variables, these memory terms can be neglected |35l |60]. As can be seen 
in fignre the decorrelation time of the nnresolved variables pc 2 is not smaller that the one 
of the resolved variables pc^ and pcg and hence time-scale separation does not apply and 


memory effects shonld be considered. In sections and W we show how transfer operator 
techniqnes can provide detailed information on this issne. 


III. TRANSFER OPERATORS, RESONANCES AND THEIR ESTIMATION 


The spectrnm of transfer operators giving the evolntion of densities indnced by the flow 
can provide valnable information on the time scales of the dynamics; in this section we show 
how these operators can be approximated on a rednced phase space. 
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A. Transfer operators of dissipative dynamical systems 


We consider an antonomous dissipative dynamical system (DDS) governed by the ODE 

X =F(x), X G A, (4) 

x(0) = xo, 

where X = is an Enclidean vector space and F : X —)■ X a smooth vector held with 
associated how The evolntion of a distribntion / of V [M] indnced by the how 

{S',-} is described by a family of linear operators {£T-}r >05 namely the Perron-Frohenius 
operators or transfer operators, snch that 

<^Tf,g>=< f,go<Sr>, for all geC°°, (5) 


where o is the composition operator and < f,g > is the action of the distribntion / on the 
smooth fnnction g. The family {Ct}t>^o is a one-parameter semigroup, that is 

Cri+T2 = ; A = ^ (6) 


While the more intnitive transfer operators acting on (integrable) probability densities 
[chap 7.4 in 03] will be considered in section III B to V , it has been shown that the statistical 


properties, snch as the decay rate of correlations, of several mixing dissipative systems are 
related to the point spectrnm of transfer operators acting on (larger) spaces of distribntions 
[451 02] • Snch statistics are provided by an ergodic invariant measnre. A measnre pi of 
a measnrable space (X, i3(X)) is invariant nnder the how if p,{Sf^{A)) = p,{A) for all 
sets A in B{X). It is ergodic if all invariant sets A are trivial snbsets of X {pi{A) = 0 or 
pi{A) = 1). Moreover, a dynamical system is mixing if lim^^oo A^(A fl Sp^{B)) = p,{A)p,{B), 
in which case the correlation function 


= j fg° Sr- jfjg ioT f E L\g e L°^, (7) 

converges to zero as the lag r goes to inhnity [chap 4.4 in 145]. 

In the case of a DDS for which the attractor has Lebesgne measnre zero, it is important 
to choose a physical invariant measnre for which spatial and temporal averages of continnons 
observables are identical for an initial set of positive Lebesgne measnre [GS] 04] . It has been 
shown, for Anosov hows, that a physical measnre of Sinai-Rnelle-Bowen (SRB) type [64] 
exists and that an appropriate Banach space can be dehned on which the semigronp 


10 




of transfer operators is bounded and strongly continuous [chap. 1.5, [65], with its point 
spectrum Pa{Cr) located inside the complex unit-disk [ 3 H]- 

Whether the system considered in this study is Anosov is a difficult question. However, it 
has been observed that many-particle systems with sensitive dependence to initial conditions 
’’behave” as Anosov [60] • We will follow this so-called chaotic hypothesis and assume that the 
transfer operators associated with the flow {S'T-}r>o constitute a strongly continuous 

semigroup and that there exists a unique SRB measure (i.e, 1 is the only eigenvalue of the 
transfer operators located on the unit disk, the system is mixing). 

For a bounded and strongly continuous semigroup, there exists an inhnitesimal operator 
A, namely the generator of the semigroup, and u{t) = Crf is the unique solution (Vr > 0) 
of the following Cauchy problem [chap. 2.6 in l65] 


u{t) = Au{t) ; m(0) = /. 


( 8 ) 


For transfer operators, the linear system ([^ is the Liouville eguation which replaces the 
non-linear problem Q at the expense of having to deal with a Partial Differential Equation 
(PDF). 

The Spectral Mapping Theorem (SMT) [chap. 4.3.7, [65] allows us to relate the point 
spectrum Pa {A) of the generator of a strongly continuous semigroup to the point spectrum 
Pa{Cr) of the semigroup operators, according to 


Pa{Cr) \ { 0 } = 


(9) 


A vector 0 is called an eigenvector associated with the eigenvalue A in Pa{Cr) if tC,T-<p = A0. 

The SMT will be used in section IV A to associate the spectrum of the transfer operators 
to motions of different time-scales. To hx the ideas, let us take an eigenvector 0 (possibly 
a distribution) associated with a non-zero eigenvalue A of Pa{Cr). For any observable g (a 
smooth test function) 


< (f),g o Sr >= < Cr(t),g > 

(10) 

= X<(j),g> . 

(11) 

Applying the SMT, there exists an a in Pa{A) such that A = e*“. 

so that (]ll[l becomes 

<(t),goSr>= <(l),g> ■ 

(12) 
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Thus, the term < o Sr > converges exponentially fast to zero at the rate —Re{a) > 
0. Loosely speaking, the point spectrum of the generator A close to the imaginary axis 
(correspondingly, the point spectrum of Cr close to the unit circle) is responsible for a slow 
rate of decay of correlations in the direction of their eigenvectors. This point spectrum, called 
the Ruelle-Pollicott (RP) resonances [151 - IT7] . is therefore a candidate for a mathematical 
explanation of the observed low-frequency variability such as induced by meta-stability. 
This important fact motivates the estimation of the RP resonances to study the dynamics 


associated with meta-stable regimes in section IV In sections III B to IV, we present how 
these resonances can be calculated from approximations of the transfer operators on the 
reduced phase space. 


B. Estimation of transfer operators on the reduced phase space 

For Anosov systems, the stability of the SRB measure [63 EH], the transfer operators [69| 
and the dominant part of their spectra jiQiin] to perturbations, has motivated the discrete 
approximation of transfer operators using Ulam-type methods EHl EHl [12] • 

For low-dimensional systems, Ulam’s method relies on a Galerkin approximation of the 
inhnite-dimensional transfer operators by hnite-dimensional matrices EHl ESI El]- For this 
purpose, the phase space is discretized into a grid G = of m grid boxes dehning an 

orthogonal basis of characteristic functions. The transfer operator Cr is then approximated 
by a time-homogenous Markov chain with transition matrix Pr whose elements are the 
transition probabilities 


(Pr)y ^ ^{Xt+r e Bj\xt e Si), 


(13) 


of a trajectory of x in box Bi to pass through Bj after a lag r. These probabilities can be 
estimated from a long time-series {xt}i<t<N of x using the Maximum Likelihood Estimator 
(MLE) 


{.PT)ij — lP(^i+T £ Bj\xt € Bi) 


H^{{xt G Bi) A (a^i+r G Bj)} 
G Bi] 


(14) 


where #{a;t G Bi} is the total number of realizations of x in Bi and #{(xi G Bi) A {xt+r G 
Bj)} is the number of realizations of x in box Bi ending in Bj r-time later. As N goes to 
inhnity, {Pr)ij converges to {Pr)ij with an error of order 0{N~^P) [75] . 
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Because, the high dimensionality of the phase space of the barotropic model {d = 231) 


prohibits the direct approximation of Cr and because, as described in section |IIB| and in 
Crommelin [25], the dynamics relevant to transitions between meta-stable regimes predom¬ 
inantly occur in the reduced phase space Y = (EOFi, EOF 3 ), we instead estimate the tran¬ 
sition probabilities of a Markov process associated with the dynamics in the reduced phase 
space Y [19|, by replacing x in (13) and (14) by the observable y = h{x.). Markov operators 


approximated by these transition probabilities cannot be expected to give the transfer of 
densities in the reduced phase space as induced by the flow S^. Nonetheless, it has been 
shown by Chekroun et al. [421 Theorem A], for systems with a unique SRB measure and 
a continuous observable h, that the transition probabilities in the full phase-space X, are 
related to the transition probabilities of the Markov process on the reduced phase-space Y 
by 


f'iyt+T e Bj\yt e Bi) = F{xt+r e h iBj)\xt e h ^{Bi)) 


(15) 


Hence the transition probabilities in X restricted to pairs {h~^{Bi), h~^{Bj)) are exactly the 
transition probabilities in Y for pairs {Bi,Bj). As a consequence, the estimated transition 
matrices {TV} contain information on the transfer operators {/It} and their RP resonances 
as hltered by the observable. 

Accordingly, we have discretized the reduced phase space Y using exactly the same 50 x 50 

Then, the matrices {TV} were estimated from 


grid as for the histogram as in section 


IIB 


the 500,000-day-long time-series of y and for different lags r. Importantly, {P.,-}.,->o need not 
be a semigroup because (i) the partial observation of the system introduces memory effects 
[38] l39] , (ii) the Galerkin approximation adds numerical diffusion US] and (iii) the estimation 
of transition probabilities from a time-series of limited length is prone to sampling errors 


(cf. section IV below). 


Before to study in detail the spectral properties of the transition matrices, we remark that 
these matrices are row stochastic, implying that their eigenvalues satisfy |A| <1. In our case, 
there exists necessarily a unique eigenvalue equal to 1 because the transition matrices {Pt}, 
by construction from a unique time series, represent irreducible Markov chains [77]. The 
uniqueness of this eigenvalue in turn implies that its associated eigenvector vf = {tti, ..., vr^}, 
or hxed point, is in fact identical to the normalized histogram H of £gure[^ 


13 







Indeed the following calculation shows that the histogram H is invariant for any 

(HPr), = EI".l 

_ #{yteBk} _ #{{yt&Bk)A{yt+T&Bj)} 

^k=l #{yt£G} ^ #{yteBk} 

_ #{{yt&Bk)A{yt+T&Bj)} 

- 2^k=l #{yt£G} 

_ #{yt+T&Bj} 

~ #{yt&G} 

H,. 

Thus, each component vTj yields the likelihood P(j/t G Bi) for realizations of the observable 
to be in Bi. It is straightforward to associate to any m-dimensional vector on the grid G the 
corresponding density it approximates na. Not doing the distinction between a probability 
vector and the corresponding probability density should thus not lead to any confusion. 


IV. SPECTRAL PROPERTIES, MEMORY AND ALMOST-INVARIANT SETS 


A. Spectral properties and slow dynamics 


We now apply the Spectral Mapping Theorem (SMT) in order to approximate the RP 
resonances (the dominant eigenvalues of the generator) from the estimated transition ma¬ 
trices Pr, as hltered by the observable h. For this purpose, we hrst solve, for each estimated 
transition matrix in {Pt}, the eigenvalue problem 

ei{T)Pr = \i(T)G(r) for i e {1,..., m}, (16) 


where ej(r) is the left-eigenvector associated with the eigenvalue Aj(r) of Pr- The spec¬ 
trum of Pr changes with the lag r. However, if {Pr}r>o was a semigroup with generator 
Am, the spectrum {ai} of Am would be independent of r and applying the SMT would give 


Ti = —Re{ai) = — log |Ai(r)| for r > 0 and for i G {1,..., m}, 


(17) 


where r* would be the exponential rate of decay of correlation of any observable in the 
reduced phase-space with the eigenvector associated with Aj. 

the transition matrices {Pr}T>o do not necessarily inherit 


However, as noted section 


HIB 


from the semigroup property ([^ of the transfer operators {£r}T>o- Thus, no generator may 
exist and the rates in (17) may depend on the lag r. Nevertheless, calculating the rates 
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FIG. 4: Rates ^^(r) corresponding to the 10 leading eigenvalues different from unity of 
each Pr, with the lag r as abscissa and the (cyclic) coloring giving the rank of the rate. A 
complex pair of eigenvalues is represented by one square for the two conjugates. The error 
bars represent 99% confidence intervals estimated from a thousand surrogate transition 
matrices by applying the bootstrap method described in Appendix 


rj(r) for each lag allows (i) to give an approximation of the dominant RP resonances with 
a control on the lag and (ii) to test the semigroup property ([^. 


For this purpose, we calculated the rates ^^(r) by solving the eigenvalue problem (16) and 
applying ( [ItI ) for r ranging from 1 to 39 days. The leading rate equals 0, since it is associated 
to the unit-eigenvalue. The rates corresponding to the 10 leading eigenvalues different from 
unity of each iR are represented figure with the lag r as abscissa and the (cyclic) coloring 
distinguishing the rank of the rate. A complex pair of conjugate eigenvalues is represented 
by one square for the two conjugates. The error bars represent 99% confidence intervals 
estimated from a thousand surrogate transition matrices by applying the bootstrap method 
described in Appendix 


Our first observation concerns the small width of the confidence intervals (some are even 
hidden by the marker size). These intervals evaluate the robustness of the estimates to the 
limited length of the time-series. The largest intervals occur when two rates almost overlap, 
so that one of them may appear or disappear in the surrogates, resulting in a change of rank 
for all higher-rank rates. In our case, large confidence intervals are thus usually indicative 
of an uncertainty in the existence of two close but distinct rates or of only one (as for the 
leading rates at r = 7 or 8, for example). Moreover, Appendix [A| shows that at least the 
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three leading rates are very robust to changes in the the grid resolution, ranging from a 
10 X 10 grid to a 100 x 100 grid, compared to the original 50 x 50 grid. 

Being conhdent on the robustness of the rates to the sampling as well as to the grid 
resolution, we now turn to a detailed analysis of the results in hgure First, we can see 
that the two rates closest to zero, the leading rates in red and green, are well separated from 
the rest of the spectrum after a lag between 5 and 10 days. One of these rates derives from a 
real eigenvalue (the circle), the other represents a pair of complex conjugate eigenvalues (the 
square); they exchange rank at a lag of 8 days. We can calculate an indicative time scale 
associated with each rate as the inverse of the rate. Whether this time scale corresponds, to 
a good approximation, to a decorrelation time is not the matter of this study. For a lag of 15 
days, the hrst rate (in green) and the second rate (in red) correspond to a time scale of 40 and 
23 days respectively. Furthermore, the second rate is separated from the third (in cyan) by 
a gap of 13 days. This time-scale separation suggests that the reduced dynamics are slowly 
mixing due to the presence of meta-stable regimes responsible for low-frequency variability 
[IHl [THl [HQ]. This conhrms the work on meta-stable atmospheric regimes in this model 
by Crommelin [28]; we will see in section IV C how such regimes can be more objectively 
detected. 

The second important feature, relevant to the problem of stochastic prediction of section 


VB, is the relative Constance of the two leading rates for lags larger than 8 days. We can 


say that the slow dynamics associated with these rates ’’behave as Markovian”, by which 
we mean that looking only at these rates, one cannot disprove the semigroup property (|^, 
even though the dependance on the lag of the other rates is clearly indicative that {iV}r>o 
cannot constitute a semigroup. Thus, the two leading rates do not seem to be affected by 
memory effects due to the partial observation of the system or by estimate errors, since one 
would not expect them to be constant otherwise. 

From the separation of the two leading rates from the other rates as well as their relative 
independence on the lag r for lags larger than 8 days, we expect 8 days to be the minimum 
lag for which the transition matrix Pr=% is likely to predominantly resolve the dynamics 
associated with the meta-stable regimes. Consequently, the following developments will rely 
mostly on the transition matrix Pr=8- Such strategy for the choice of the lag is similar to 
the one of DelSole [8T| and Berner [82|, who look directly at the decorrelation rate of their 
time series to infer for which lag they should estimate the drift and diffusion coefficients 
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of the Fokker-Planck equation they want to approximate. In our case, however, all the 
rates associated with the dominant eigenvalues of the transfer operators are considered and 
not only the decorrelation rate of the time series alone. Let us also acknowledge that the 
SMT has been used in Crommelin and Vanden-Eijnden [H3] to estimate the spectrum of 
the generator associated with a Fokker-Planck equation, in Froyland et al. I7SI to compare 
approximates of transfer operators and generators of low-dimensional dynamical systems 
and in Franzke et al. |T2] to test the meta-stability of Hidden Markov Models. However, we 
do not know other studies using the SMT to test the ” Markovianity” of reduced systems, 
even though the r-test in [SI] serves a similar purpose in the context of Linear Inverse 
Modeling. 


B. Further validation of the semigroup property 

We further test to which extent the semigroup property (|^ is violated for special cases 
of powers k of Pr and the corresponding matrices Pkxr- Rather than directly calculat¬ 
ing a distance between the matrices (PV)^ and Pkxr, we prefer to calculate, for an ini¬ 
tial density /o, the distance between the density = /o(Rt)^ transferred by the /c*^ 

power of Pr and the density = foPkxr transferred by Pkxr- We use the distance 

dif,9) = VTZ -1 ~ QiPi which is a sum of squared errors between the components of 

/ and g, weighted by the likelihoods F{yt G Bi), so that errors in grid boxes less likely to be 
reached by the Markov process are given less weight. The distances have been 

calculated for a set of m initial densities associated with each grid-box in {Bi}, so that the 
density associated with box Bj integrates to 1 in Bj and to zero elsewhere. The distances 
are represented hgure and b for a lag r of 8 days and for a multiplicator fc of 2 and 4, 
respectively (dark colors indicate small distance). 

To explain the nature of the violation of the semigroup property ([^, let us recall that there 
exists only three possible candidates, namely (i) the partial observation of the dynamical 
system, (ii) the coarse-graining induced by the Galerkin approximation and (iii) sampling 
errors. Furthermore, it is shown in Appendix that the distances plotted hgure are not 
affected by the use of twice as more samples and that they only decrease slightly when the 
resolution increases (and vice versa). Thus, we can say that positive distances in hgure 
are mostly the consequence of memory ehects induced by the partial observation of the 
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FIG. 5: The value at each grid-point represents the distance, for an initial density of 1 at 
this grid-point, between the density transferred by the power of Pg and a density 

transferred by P^xs for (a) = 2 and (b) k = A. 


high-dimensional barotropic model. 

We can observe that these memory effects are mostly important where the stationary 
density is small (hgure and thus, where trajectories are less likely to pass by. The denser 


regions which are associated with meta-stability, as will be seen section IV C seem to be 
less affected by memory effects. This result is in agreement with the relative constancy with 
lag of the leading rates, also associated with meta-stability. Such memory effects, as well 
as the dependence on the lag of the spectral gap between the leading rates, should be put 
in perspective with stochastic modeling with Stochastic Differential Equations (SDE) and 
based on time-scale separation [Ml EH EZ]. Indeed, in such models, one seeks a reduced 
basis for the decorrelation time of the resolved variables to be much longer than the one 
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of the unresolved variables, in order to be able to neglect the memory effects made explicit 
by the Mori-Zwanzig formalism [HSl IHU] . If the transition matrices on the reduced phase 
space do not directly help in the choice of the basis on which the dynamical system can 
be optimally reduced, their rates can give insights on the impact of the reduction on the 
resolved variables. 


C. Meta-stable regimes as almost-invariant sets of the transfer operator 


We have seen, section IV B that the two leading rates of hgure are close to zero and 
that a large spectral gap separates them from the rest of the rates, a conhguration indicative 
of the presence of meta-stable regimes. This characteristic of meta-stability or persistence 
allows to formally dehne these regimes as almost-invariant sets [7^ l80] . We now give an 
extension of the dehnition of almost-invariant sets to sets in the reduced phase space and 
present an algorithm to detect them from a transition matrix Pt-. 

A set A of the phase space X is almost-invariant if S'T^(A) A, so that 


V’{xt+T £ A\xt G A) 1. 


( 18 ) 


Reformulating, the probability for a trajectory starting in a set A to leave this set after a 
lag r is almost-zero. These sets are thus associated with persistent or meta-stable regimes. 

In the case of almost-invariant sets in the reduced phase space, we are interested in sets 
E of Y, almost-invariant with respect to the transition probabilities P(2/f+r G E\yt G E), 
such that 


F{yt+r G E\yt e E) ^ 1. 


(19) 


However, applying (15) [IHl Theorem A] we have that 

F{yt+r e E\yt e E) ^ 1 F{xt+r & h~^{E)\xt e h~^{E)) ^ 1. 


( 20 ) 


This important result states that if a set E is almost-invariant in the reduced space V, its 
pre-image h~^{E) in X is almost-invariant to the flow S^. In other words, almost-invariant 
sets in the reduced phase space are images of almost-invariant, yet coarser, sets in the full 
phase space. Of course, these coarse-grained almost-invariant sets may not be optimal, in 


the sense that other, more strongly almost-invariant sets (w.r.t (19)) may exist but are 
hltered out by the observable h in the same way RP resonances can be hltered out by h. 
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Based on these considerations, the transition matrix Pr=8 was used to dehne the meta¬ 
stable regimes objectively. For the detection of almost-invariant sets [see also. 1781 - 180] we use 
an optimal Markov chain reduction [85l l86] with respect to the relative entropy rate |87| . 
This type of Markov chain reduction is particularly well suited for the detection of dense 
almost-invariant sets (highly recurrent), since it attempts to minimize the distance between 
a density transferred by the reduced transition matrix (giving the transition probabilities 
between the almost-invariants, see below) and the same density transferred by the original 
transition matrix. The optimization was implemented using the greedy algorithm from 
network theory [88] , where the grid-boxes are iteratively merged to give coarser and coarser 
almost-invariant sets. 

In accordance with the bimodality of the histogram (hgurej^, we have chosen to look 
for a number of almost-invariant sets p of 2. These two sets are plotted in hgure such 
that all grid boxes in green belong to the hrst almost-invariant set and all grid boxes in blue 
belong to the second one. For the family of almost-invariant sets {Ejs}, the 2-by-2 reduced 
transition matrix Qr=8,, such that (( 5 r= 8)/37 = ^{Vt+T ^ E.y\yt G Ejs), and its stationary 
density r], such that rjjs = G Ep), are found to be 


Qt=8 — 


1 0.79 0.2l\ 

1^0.14 0.86y ^ 



( 21 ) 


the second almost-invariant set (in blue) being almost three times as dense as the hrst one 
(in green). 

The algorithm is designed to hnd almost-invariant sets whose union covers the entire 
grid. However, in view of the early warning problem discussed in section |V| we need to 
hnd a restriction of the dehnition of the regimes {Ry} to smaller regions of the grid so 
that the likelihood P(|/t G Ry) to be in any regime Ry becomes smaller than one [H9]. To 
do so, we selected, for each almost-invariant set Ep, their grid-boxes Bi maximizing the 
likelihood ^{yt G Bi^yt+r G Ej^^yt-r G Ey) of a realization of y to be in B^ and to come 
from and go to the same almost-invariant set Ep, until a sufficiently large number of boxes 
have been attributed to the regime R^ to have F{yt G R^) = P(?/i G Ep)/2 (until half of the 
almost-invariant set has been selected in terms of stationary density tt). These restrictions 
are plotted in dark green and dark blue (Fig. and dehne the blocked and zonal regimes, 
respectively. The probability to stay in the so-dehned blocked and zonal regimes after 8 
days is of 66% and 70%, respectively. 
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FIG. 6: Two almost-invariants sets and their restrictions (in dark colors) corresponding to 

the blocked and the zonal regime, respectively. 

It is shown in the Appendices and that these regimes are relatively robust to the 
limited length of the time series, to the grid resolution and to the lag, respectively. Their 
definition together with the transition matrices {A-} are used in section|^to define an early 
warning indicator of a transition to the blocking regime. 

V. PREFERRED TRANSITION PATHS AND EARLY WARNING 

A. Preferred transition paths 

Having defined the regimes, we now study the transitions between them. Following 
Branstator and Berner [90], we hrst plot the mean tendencies of the normalized principal 
components pci and pcs. The tendencies were calculated for each principal component 
using a hnite difference scheme such that ApCj(t) = (pCj(f + At) — pCj(t))/Af, where ApCj 
is the approximate tendency of the principal component and At is the time step. An 
estimate of the mean tendency for each grid box was then calculated by averaging over all 
the realizations of y in this grid box. 

The mean tendency for a time-step At of 8 days is plotted figure It can be seen 
as a composition of a clockwise rotation and two sinks. This result corroborates both 
the meta-stability of the regimes and the existence of preferred transition paths between 
them, reminiscent of a pseudo-periodic orbit [H |28]. Indeed, the rotation is such that 
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FIG. 7: Mean tendency of the normalized principal components calculated using centered 
differences for At = 8 days. Arrows represent the direction and the colors represent the 
magnitude (in standard deviation per day). As for the following hgures, the black contours 
delimitate the blocked regime, marked by the letter B, and the zonal regime, marked by 

the letter Z. 


typical trajectories leaving the zonal regime (blocked regime) to go to the blocked regime 
(zonal regime) transit through negative values (positive values) of pca. Furtermore, the 
correspondence between the sinks with low-values of mean tendency and the regimes is 


striking, in particular for the zonal regime. We have seen in section |IVB| that the memory 
effects are relatively weak in the region of the regimes. In the limit when these effects can be 
neglected, the reduced dynamics can be modeled by an SDE and the mean tendency gives 
an approximation of the drift term involved in the Fokker-Planck equation [together with 
diffusion, not calculated here, 1321E2] which generates the semigroup of transfer operators 
associated with the SDE. The correspondence between weak tendency and almost-invariance 
is thus not coincidental. 


To further support the existence of preferred transition paths from one regime to the 
other, we have calculated, for each grid box, the likelihood Pzb (Pbz) that a trajectory 
starting in the zonal regime (blocked regime) and passing through this grid box reaches the 
blocked regime (zonal regime) before the zonal regime (blocked regime). 

The resulting likelihoods are plotted in hgure In agreement with the tendency, the 
trajectories going from the zonal to the blocked regime are more likely to do so through low 


22 














0.5 K, 

0.4^ 

0.3 

0.2 

0.1 


FIG. 8: Likelihood, for each grid box, to reach (a) the blocked regime before the zonal 
regime and (b) the zonal regime before the blocked regime. 

values of the 3'’’^ principal component while trajectories going from the blocked to the zonal 
regime favor high values of the 3'”'^ principal component. 


B. Early warning indicator 


The presence of preferred transition paths from the zonal to the blocked regime, as well 


as the weak mixing associated with the existence of rates close to zero (cf. section IVA), 
suggests that there is potential skill in predictability of transitions to the blocked regime. 
Note that more than trying to predict when a trajectory will leave the zonal regime, we 
want to use the fact that trajectories leaving the zonal regime are more likely to go to 
the blocked regime if they transit through negative values of pca. We therefore use the 
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transition matrices for different lags to provide an approximation of the transfer of densities 
in the rednced phase space by the flow and to bnild an early-warning indicator of transitions 
to the blocked regime. 

The qnality of the indicator will depend on (i) the predictability of the fnll system (in 
terms of sensitive dependence to initial conditions or rate of decay of correlations) and (ii) 
the validity of the Markov approximation for the rednced dynamics. While the former is de¬ 
termined by the dynamical system alone, the latter depends on the choice of the observable. 


We have seen in section IV that the violation of the semigronp property indicates that the 
reduced dynamics are overall non-Markovian. However, the constancy of the leading rates 
also indicates that motions associated with the meta-stable regimes behave as Markovian, 
suggesting that densities transferred by the estimated transition matrices could be used to 
give a probability of reaching one of the regimes. 

The early-warning indicator is thus designed as such. Whenever the observed trajectory 
(in our case, the simulation) leaves the zonal regime, we dehne an initial density /Bj,T= 0 ; 
integrating to 1 over grid-box Bi to which the last observation belongs and to zero elsewhere 
(one could instead dehne an initial density spreading over several boxes to account for 
uncertainties in the observation). Next, the Markov approximation fB^^T=i of the transfer of 
fBi,T=o by the how after a lag r of 1 day is calculated as fB-^T=oPT=i- We then calculate the 
likelihood P( 2 /r=i e i?biock| 2 /o e Bi) = ^ realization yt+i to belong to 

the blocking regime knowing that i/t belongs to grid-box Bi. If this estimated probability 
exceeds a given critical probability Pc, an alarm of transition to the blocked regime after a 
lag Taiarm of 1 day is given. Otherwise, the same process is repeated for r = 2, 3,... until an 
alarm is given or a limit lag Xmax is reached, after which we wait for the next observation to 
run the forecasting system. 

To illustrate this process we show in hgure the transferred density for grid box 314 
marked as a blue square and lying in the region of preferred transition paths between the 
zonal and the blocked regimes. This initial density ,-=o, is plotted alone panel d). The 

densities /Bi= 3 i 4 ,T= 4 , /Bi= 3 i 4 ,T =8 and f transferred using the transition matrices Pt=A) 
Pt =8 and Pt=i 6 are plotted panel (a), (b) and (c), respectively. The densities /Bj^ 3 i 4 ,r= 2 x 4 and 
fBi^ 3 i 4 ,T= 2 x 8 transferred using the square of the transition matrices Pr =4 and Pt =8 are also 
plotted panel e) and f), respectively, to show how the semigroup property can be violated. 
The corresponding likelihoods to reach the blocked regime are written in the top left of each 
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panel. 

We can see that if a critical probability pc of 0.3 would be chosen, an alarm of transition 
to the blocking regime would be raised for r = 8 days (hgure |^), as described in the 


previous paragraph. Figure 10 shows the grid boxes for which an alarm would be given if a 
critical probability Pc of 0.3 was used and the respective lag Taiarm after which the transition 
is predicted to occur. In this case, alarms are mainly flagged for trajectories passing through 
grid boxes close to the blocked regime or in the region of low principal component pca. 

The quality of the early warning indicator of transition to the blocked regime was tested 
over all trajectories starting when leaving the zonal regime and ending when reaching any of 
the regimes. In order to take into account sampling errors in the estimation of the transition 
matrices, the test was performed on a second 500, 000-day-long simulation which was ran 
taking as initial state the last state of the original simulation from which the transition 
matrices were estimated. Out of the 11759 trajectories starting from the zonal regime, 1993 
reached the blocked regime, which we call an occurrence event (O), while 9766 reached the 
zonal regime, which we call a non-occurrence event (O). 

When testing a forecast system, one is interested in how many alarms (A) were given 
when the event actually occurred, the number of so-called hits, and how many did not occur, 
the number of so-called false alarms. In the case of a hit, the event occurs at a given time Tqcc 
after the alarm is given, so that not only the occurrence of the event should be forecasted, 
but also when it will occur. For easy reference, the different cases are shown in Table 
For example, a forecast that a blocking event will occur in a month although it will truly 
occur only a week later is not of much help. For the purpose of assessing the skill of a 
forecast not only in terms of occurrence but also in terms of precision of the forecasted date 
of occurrence, we need to dehne a tolerance e, in days, such that a hit (H) is granted only 
when an alarm is given with a prediction time Taiarm such that Tqcc — e < Taiarm < Tjcc + £• If) 
however, the event is predicted to happen too early (Taiarm < Ficc if is counted as a false 
alarm of type II (FA2), while if the event is predicted to happen too late (Taiarm > Tqcc + ^), 
it is counted as a missed alarm of type II (MA2). When the event did not occur and no 
alarm (A) was given, we count a correct rejection (CR). When an event occurs but no alarm 
is given, we count a missed alarm of type I (MAI) and when an event is predicted but does 
not occur, we count a false alarm of type I (FAl). 

To assess the skill of the forecasts based on the categories dehned in Table |T} we adapted 
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9: Markov approximation fBi^ 3 i 4 ,T of the transfer of the initial density fBi^ 3 i 4 ,T=o for a 
■ of (a) 4 days, (b) 8 days and (c) 16 days. The initial density is plotted in panel (d), 
grating to one over box i?j=3i4 and to zero elsewhere. The densities transferred using 
the square of Pt=a and of Pr=8 are plotted panel (e) and (f), respectively. The 
esponding likelihoods to reach the blo2§;ed regime are written in the top left of each 

panel. 
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FIG. 10: Predicted time of transition to the blocked state Xaiarm, in days, for each grid box 
where an alarm is given for a critical probability Pc of 0.3. 


TABLE I; Overview of the different cases which can occnr depending on the alarm given 

and the event occnrring. 



A 

A 

A 

A 


Tf < To - e 

|r/ - Tol < e 

Tf > To + e 


0 

FA2 

H 

MA2 

MAI 

0 

FAl 

FAl 

FAl 

CR 


the original Peirce skill score (PSS) [55l [56] to forecasts inclnding the time of occnrrence. In 
onr case, we dehne the PSS as 

SPeirce{Pc^ Hff(Pc) kAR.i(P(;), 

where IIR(pc) = H(Pc)/0(pc) is the hit rate, dehned as the ratio of the nnmber of hits over 
the nnmber of occnrrences, and FARi(pc) = FAl(pc)/0(pc) is the false alarm rate of type I. 
The hit rate gives the likelihood of giving an alarm when the event occnrs, while the false 
alarm rate of type I gives the likelihood that an alarm is given bnt the event does not occnr. 
If the hit rate exceeds the false alarm rate the PSS is positive and the forecast has skill. A 
PSS of 1 is reached for a perfect forecast, when alarms are only raised when an event will 
actnally occnr and with the right predicted time of occnrrence. 

The PSS is plotted hgnre [T^ versns the critical probability Pc and for different tolerances 
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FIG. 11: Peirce skill score of the probability forecast of reaching the blocked regime versus 
the critical probability pc for different tolerances e (straight lines). The average time lag in 
days for which an alarm is given is also plotted in black dashed line. 


e. A tolerance of oo means that the precision of the predicted date of occurrence is not 
taken into account in the skill score. It is evident that as the tolerance increases, the skill 
score increases as well. The PSS reaches a maximum for a critical probability around 0.5, 
depending on the tolerance. Such scores have to be put in perspective with the predicted 
time of occurrence Taiarm- Indeed, a successful prediction of occurrence of a blocking event 
1 day ahead is counted as a hit, but is not of much practical use. For this reason, we have 
also represented the average prediction time Xaiarm as a black dashed line on the same hgure 


11 It is of around two weeks for a critical probability of 0.3 and around one week for a 


critical probability of 0.45. Thus, depending on the hnal purpose of the forecast system, a 
compromise has to be found between the skill of the forecast system and how many days 
ahead an event is predicted on average. Choosing a critical probability of 0.3 would give a 
reasonably good skill score of 0.4 (better than a no-skill forecast) with an average prediction 
time of 2 weeks. 


To summarize, the main advantage of a forecast system relying on the transfer of densities 
is that it constitutes a very cheap way to account for the sensitive dependence on initial 
conditions of a chaotic dynamical system. 
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VI. ENERGETICS OF THE TRANSITIONS 


In this section we focus on the remaining question on how the dynamics of the barotropic 
model can explain (i) the persistence of each regime and (ii) the preferred transition paths 
from the zonal to the blocked regime through high values or PC 3 . 

To help clarifying these issues the hemispheric energy budget of the model is studied. 
First, the fields are decomposed in a r = 8 days running mean and a deviation from it. This 
decomposition yields for the streamfunction 


- 1 


ip = -Ip + ip' with Ip = — 


/‘I 


ft-f/2 


p) dt'. 


( 22 ) 


Inserting (22) into equation ([^and applying the running mean gives the equation of the 
mean relative vorticity 


+ JPP, V'^pj + f + h) + J{PjP V^Pj') = + hV^pj + V^Pj* 


dt 


(23) 


Subtracting (23) from ([^ gives the equation of the deviation from the running mean 


as 


dt 


+ J{Pj, VV') + + / + h) 


+J{pj', VV') - VV) = -kiV^pj' + fcaVV- 


(24) 


In order to obtain the equation of the hemispheric average of the mean kinetic energy 


E =< 


>, with < • > denoting the hemispheric average ^ ■ cos0d0dA, equa¬ 


tion (23) is multiplied by p^ and averaged hemispherically, giving 
dE 


=< J{P^', V^p)') > —2kiE — k2 < p)V^p> > — < pjV^pj* > . 


(25) 


The first term on the right hand side is equal to the opposite of the sum of the Reynolds’ 
stress terms which represent a conversion of mean to eddy kinetic energy. Finally, multi¬ 


plying (24) by p}', applying the running mean and averaging over the hemisphere gives the 
equation of the global eddy kinetic energy E' =< 
dE' 


> 


dt 


= -< pjJipj', V^p)') > -2kiE' -k2< pj'V^pj' > . 


( 26 ) 


The terms in the equations (25) and (26) were calculated from the model simulation results 
and we could verify that the calculated tendencies equated the sum of the right hand side 
terms but for a small error of up to 13% of the standard deviation of the tendencies due to 
the running average of a deviation not being exactly zero. 
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FIG. 12: (a) Hemispheric mean kinetic energy E in Jm“^, (b) Hemispheric eddy kinetic 
energy E' in Jm“^. The green line, starting with a black sqnare and ending with a black 
triangle, represents a 200 days long trajectory transiting from the zonal to the blocked 

regime. 


The energetics of the transitions can be stndied by plotting the kinetic energies (Fig. 12) 


and the terms in the energy bndget (Fig. 13) averaged for each grid box of the rednced 
phase space. To these plots, a 200 days long trajectory transiting smoothly from the zonal 
to the blocked regime is added in green, starting with a black sqnare and ending with a 
black triangle. It is first interesting to notice that low valnes of E' coincide rather well 
with onr dehnition of the regimes (Fig. [I^). That fact that the eddies are weak in the 
neighborhood of the regimes is mostly explained by low valnes of conversion to eddy kinetic 


energy (Fig. 13:), in particnlar for the zonal regime, and additionally by a negative forcing 


for the blocked regime (Fig. 13 >d). This stabilization of the flow in the region of the regimes 
is a good physical candidate to explain their persistence. 

We have seen (section |V]), that typical trajectories from the zonal to the blocked regime 


transit throngh the region of negative pcg. Fignres 12 and 13 allow ns to give a typical 


scenario for snch a transition. Each step of the following scenario is marked in hgnres 12 
and by the corresponding nnmber. 

1. Starting in the zonal regime, the Reynolds’ stress terms are small and the mean flow 
is stable. However, the positive forcing indnces an increase of the mean kinetic energy 
E. 
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FIG. 13: (a) Tendency of the hemispheric mean kinetic energy in Jm“^day“^, (b) 
Tendency of the hemispheric eddy kinetic energy in Jm“^day“^, (c) Hemispheric Reynold’s 
stresses (conversion from mean to eddy kinetic energy when positive) in Jm“^day“^, (d) 
Snm of hemispheric forcing and dissipation in Jm“^day“^. The green line, starting with a 
black sqnare and ending with a black triangle, represents a 200 days long trajectory 
transiting from the zonal to the blocked regime. 


2. As E increases and the forcing persists, the trajectory evolves to lower valne of pcg 
and eventnally leaves the zonal regime. 

3. The trajectory then reaches a region of pcj^ close to zero and low PC 3 . The forcing 
is still strong bnt the Reynolds stress terms begin to increase becanse the strongly 
sheared flow becomes barotropically nnstable for lower valnes of pCg, so that the total 
eddy kinetic energy E' continues to increase. 
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4. As the trajectory reaches lower values of pcj^, the forcing reverses but the Reynolds’ 
stress terms continue to increase as the barotropic eddies which emanated at the 
previous step develop, so that E is converted to E' which continues to increase. 

5. The trajectory then goes to larger values of pcg where the forcing is negative and 
energy is removed both from the mean flow and from the eddies so that the barotropic 
eddies decay. 

6. The trajectory eventually reaches the blocked regime for lower values of pcj^ where E' 
decreases due to the negative forcing coincident with relatively small Reynolds’ stress 
terms. The mean flow is once again relatively stable, although not as much as for the 
zonal regime, as can be seen for the relatively large Reynolds stress terms in the region 
of largest pc^ inside the blocked regime. 

This scenario is consistent with the mechanisms of chaotic itinerancy [26], heteroclinic 
connections [201128] and almost-invariant sets bounded by invariant manifolds ua. Indeed, 
when the reduced state is in the zonal regime, the flow is relatively stable as it belongs to 
what would be the basin of attraction of the zonal regime. However, as it moves towards 
the neighborhood of positive forcing, energy is given to the mean flow, the horizontal shear 
increases and the flow becomes unstable to small perturbations. The increasing Reynolds’ 
stress term indicates that the perturbation grows, starts to interact with the mean flow and 
enables the state to leave the basin of attraction of the zonal regime. 

VII. SUMMARY, DISCUSSION AND CONCLUSIONS 

Using concepts of transfer operators of dissipative dynamical systems, stochastic reduc¬ 
tion and meta-stable regimes, we developed an early warning indicator of midlatitude at¬ 
mospheric regime transitions. It was applied here to the transitions between a zonal and 
blocked flow in a barotropic hemispheric atmosphere model. 

Transfer operators yield the evolution of densities induced by the flow. Markov operators 
have been estimated on the reduced phase space to approximate the point spectrum of the 
generator of the transfer operators, namely the Ruelle-Pollicott resonances. Their approxi¬ 
mation for different lags have been used to assert the presence of slow dynamics associated 
with meta-stable regimes and the impact of memory effects on these motions. The presence 
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of rates close to zero and separated from the rest of the spectrum supports the existence of 
almost-invariant sets associated with meta-stable regimes and with time scales of three to 
six weeks. Furthermore, the relative constancy of these rates showed that memory effects 
are weak in the meta-stable regimes and along the transition path from the zonal to the 
blocked regime. 

In order to objectively dehne these regimes, we have developed an algorithm for the 
computation of almost-invariant sets based on earlier work in [6HI [ZHl EO], as well as on 
optimal Markov chain reduction [85l |86] and greedy optimization in networks [88]. The 
algorithm attempts to minimize a measure of the distance between the reduced Markov 
operator (giving the transition probabilities between the almost-invariants) and the original 
one and allows the detection of sets which are both recurrent and persistent. The algorithm 
has enabled us to robustly dehne the blocked and the zonal regimes. 

Compared to spectral almost-invariant detection algorithms such as developed earlier by 
[68 ] ITHHHn] . which are based on the decomposition of the leading eigenvectors of the transfer 
operator into characteristic functions supporting the almost-invariant sets, our algorithm is 
expected to perform better in the detection of more than two almost-invariant sets, since 
the latter uses an aggregative implementation rather than a divisive one like the spectral 
algorithms. For example, a similar algorithm |Hn| has been used for the detection of a dozen 
of communities, associated with spatial patterns of variability, in a correlation network of sea 
surface temperature [91]. In this study, we were only interested in the bi-partition problem 
and our algorithm showed comparable performance in terms of invariance with respect to 
that in [78] . 

The transfer operator based algorithms are also comparable to Hidden Markov Models 
[HMM, nn ng, with the difference that the former are non-parametric (no assumption is 
made on the distribution of the reduced states). Contrary to HMM, our algorithm detects 
hard clusters (non-overlapping sets) and takes as input the transfer operator rather than di¬ 
rectly the time series of the observable. In theory, it would be possible to adapt the algorithm 
to soft clustering (to allow overlapping between the clusters) but the optimization problem 
would become harder as it would necessitate another algorithm than the (combinatorial) 
greedy algorithm. 

The energy budget of the model showed that striking similarities exist between regions of 
low EKE and regions of almost-invariance, weak tendency and small memory effects. Low 
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EKE is indicative of stability of the mean flow and thus of almost-invariance, low tendency as 
well as weaker memory effects due to the small amplitude of the faster unresolved variables. 

The early warning indicator is based on a forecasting scheme involving the evolution 
of densities by the estimated transfer operator in a two-dimensional reduced phase space 
spanned by two EOFs of the model. It relies on the Markov approximation of the evolution 
of an initial density in the neighborhood of the latest observation of the system and on the 
estimation of the probability to reach the blocked regime. A warning is broadcasted for a 
lag 'Taiarm If this probability exceeds a prescribed critical probability. The quality of the early 
warning indicator, as measured by the Peirce Skill Score, is highest for a critical probability 
of about 0.5 but a smaller critical probability of 0.3 allows to emit warnings two weeks ahead 
of the event, on average. A next step is to investigate how this promising indicator would 
perform for more realistic models of atmospheric flow transitions. 

While the model here has obvious dehciencies (e.g. the lack of the representation of 
baroclinic instability), it is one of the midlatitude atmospheric models for which regime 
transitions are found and hence forms a nice test model for the development of the early 
warning indicator. The computational procedure can in principle be carried out with a 
General Circulation Models (GCM) exhibiting regime behavior [l6l[T8], if sufficiently long 
simulations can be performed. Note that in our case, we used a 500,000-day-long simulation 
(~ 1300 years) to assure the signihcance of our statistics. However, the dominant part 
of the spectrum was found to be robust to the use of only 50 x 365 samples which is the 
typical size of an operational GCM or a reanalysis record. The question remains if the 
transfer operator based early warning indicator would perform well with nearly Gaussian 
GCM or observational data [m cni CH], where it is not clear whether preferred transition 
paths between the meta-stable regimes exist. 

Another application could be to estimate transfer operators from a long run of an opera¬ 
tional weather prediction GCM and use these operators in parallel with a deterministic run 
of the GCM, in order to account for both uncertainty in the deterministic forecast |92], and 
to give an early warning indicator of transition to a new meta-stable regime. 

Finally, the early warning indicator presented here certainly extends those based on 
critical slowdown [50], such as the increase in variance and lag-1 autocorrelation, which 
dehnitely have problems in high-dimensional phase space. Even the network based indicators 
[5TH5^ are not readily extended to high-dimensional systems. It is therefore hoped that the 
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techniques in this paper will hnd application in many types of chaotic high-dimensional 
systems as found in physics and engineering. 
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Appendix A: Bootstrap applied to transition matrices 


This appendix is concerned with the limited length of the record used to estimate the 
transition matrices {Pt} dehned in section 


IIIB 


This statistical inference problem can result 

and 


IV A 


in errors, notably in the rates and the regimes calculated from {Pr} in section 
respectively, and requires an estimation of conhdence intervals. 

Following Chekroun et al. [19], we use a version of the non-parametric bootstrap 


IV C 


adapted to the estimation of transition matrices as done in [95]. It starts from the matrix 
Tj. counting the transitions of {yt\ from one grid box to another, before the normalization 
has been applied to get the transition probabilities. From T,-, Ng surrogate count matrices 
{T*g}i<s<Ns are generated. This is done, for each row z, by taking with replacement rii tar¬ 
get grid boxes among the n* transitions starting from grid-box Bi of Tr. This accounts for 
drawing n* times from a Multinomial distribution with vector of probabilities {(P’r)ii}i<i<m- 
From these Ng surrogate count matrices, the transition matrices {P*g}i<s<Ns are then calcu¬ 
lated by normalizing their rows. These matrices can then be used to estimate the sampling 
error of any functions of the transition matrix P-^. 

A thousand surrogate matrices {P*^}i<s<Ar^ are used to compute 99% conhdence intervals 
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for the rates represented in figure]^ For each lag r, the leading rates have been calculated for 
every surrogate transition matrix in {P*s}i<s<Ns- For each rate rj(r) one gets a distribution 
of surrogate rates {{ri{T))*}i<s<Ns- After sorting these distributions, the (0.005 * and 
the (0.995*A^s)*^ values are taken to give the lower bound and the upper bound, respectively, 
of the conhdence interval for rate rj(r). 


In this study, we added a bias correction to the intervals because the described construc¬ 
tion of the surrogate matrices introduces a bias towards lower values for rates, in particular 
for the secondary rates. We give an example to explain this bias. Assume that only two 
transitions start from a box Bi, with targets Bk and Bi. In the surrogate, the probability to 
pick two different transitions {Bk and Bi) will be of 1/2, while the probability to pick two 
of the same transitions (twice Bk or twice Bi) will also be of 1/2. Thus, a bias is introduced 
towards a weaker mixing resulting in lower values of the rates (in particular to the secondary 
ones more sensitive to such details). This bias was removed by centering the mean of the 
surrogate rates {(ri(r))*}i<s<Ars to the the value of the original rate ri(r) being tested. 


The surrogate transition matrices can also be used to test the robustness 

of the regimes to the sampling by running, for each the regime detection algorithm 


described in section IV C The membership of a grid box to a regime can be given by a 
membership matrix Mip such that Mip = 1 if grid box Bi belongs to regime Ep, and is 
zero otherwise. Confidence intervals are not well suited for statistics taking logical values. 
Instead, we compute, for each grid-box Pj, the fraction of surrogates for which Bi belongs to 
regime Ep. These fractions are plotted in figure [T4^,b for the blocked and the zonal regime, 
respectively, using Ng = 100 surrogates. 


We can see that the core of the regimes are very robust to the sampling, as indicated by 
the amount of grid boxes in each regime for which a fraction close to a 100% of the boxes 
of the surrogates have been attributed to the same regime. However, boxes along the path 
of the transition from the zonal to the blocking regimes are less robust to the sampling. 
Such exit-region and entering-region for the zonal and the blocked regime, respectively, are 
indeed harder to attribute to a regime, since they correspond to regions where the invariance 
is weak. A possibility would be to add a simulated annealing step in the almost-invariant 
sets detection algorithm as has been used, for example, in Rosvall and Bergstrom |86] . 
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FIG. 14: The value at each grid box represents the fraction of surrogate regimes for which 
the grid box belongs to (a) the blocked regime or (b) the zonal regime out of a 100 


surrogates. The surrogate regimes were detected by applying the algorithm of section |IV C 

to the bootstrap surrogate transition matrices. 


Appendix B: Robustness to grid resolution 


The robustness of the rates and the regimes to the grid resolution for which the transition 
matrices are estimated is tested next. 

Figure [T^, b represent rates calculated from transition matrices estimated on a grid of 
10 X 10 and 100 x 100 boxes respectively, to be compared with £gure|^of section IV A We 
can see that at least the 3 leading rates in green, red and cyan are not much affected by the 
grid resolution so that the analysis of section IV A| remains valid under changes of the grid 
in this range. This robustness to the grid of the leading rates is in agreement with the fact 
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FIG. 15: Rates T’j('r) as in figure]^ but calculated from estimates of transition matrices on 

a (a) 10 X 10 grid and (b) 100 x 100 grid. 

that they represent slow large-scale motions less likely to be affected by small perturbations 
of the transition matrices. 

Figure [T6^,b represent regimes detected from transition matrices estimated on a grid of 
20 X 20 and 200 x 200 boxes respectively, to be compared with £gure|^of section [IV C[ It can 
be seen that in both cases the regimes are very much alike to the one plotted hgure for the 
grid of 50 x 50 boxes. The quality of the regimes deteriorates for grids coarser than 20 x 20 
(not represented here) but remains relatively good for resolutions as rehned as 200 x 200 
(Fig. I^b). This robustness of the regimes to very refined resolutions can be explained by 
the aggregative nature of the almost-invariant sets detection algorithm. Indeed, one expects 
the estimates of the transition probabilities between grid boxes to deteriorate as the grid 
becomes thinner and as the number of samples by grid box decreases. However, because 
the algorithm iteratively agglomerates grid boxes into clusters, the transition probabilities 
between these coarser and coarser clusters become less and less sensitive to the sampling as 
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pci 

FIG. 16: Almost-invariant sets and their regimes as for figure]^ but for a grid of (a) 

20 X 20 and (b) 200 x 200. 

the number of samples by clusters increases with their size. 


Appendix C: Robustness of the regimes to the time lag 


In addition to the robustness test of the regimes to the grid resolution, we next test their 
robustness to the choice of the time lag of the transition matrix from which they are detected. 
Regimes detected for lags r of 7 and 50 days are plotted in hgure p!7^,b, respectively, to be 
compared to hgure of section IV C In this range, we can see that the quality of the regimes 
is robust to the lag. 

For lags shorter than 7 days, however, the algorithm is not able to distinguish the tran¬ 
sition path between the zonal and the blocked regime from the regimes themselves (not 
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shown here). Such result is in agreement with the fact that for lags smaller than 7 days, the 
complex pair of rates represented by green squares in figure more likely to be associated 
with the transition, is dominant over the real eigenvalue represented by red circles and likely 
to be associated with the meta-stability of the regimes. Thus the motions associated with 
the transition appear to be dominant over the meta-stability of the regimes for such short 
lags. 


The deterioration of the regimes for lags larger that 50 days is to be expected due to the 
fact that, as seen section IVA the leading rates of figure]^ associated with meta-stability, 
have time-scales not larger than 40 days, so that for longer lags, the motions associated with 
the meta-stable regimes are likely to decorrelate and cannot be detected by the algorithm. 
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Appendix D: Test of the semigroup property 


In this appendix, we justify that the distances plotted in figurej^ in section IV A[ between 
densities transferred by a power of 2 of Pg and densities transferred by P 2 X &1 are mostly 
explained by memory effects rather than by the Galerkin approximation or by the sampling. 

To do so, we first represent the same distances but calculated from transition matrices 
estimated from time series of only 250, 000 realizations and as much as 1, 000, 000 realizations 


(Fig. 18 a and b respectively), compared to the 500, 000 realizations from which the distances 
represented hgurej^have been calculated. We can see that neither plots seem to be affected 
by the short or long length of the time-series. One has to go to record lengths shorter than 
200,000 realizations to start to see a difference in the distances (not shown here). Thus, 
the distances represented in hgure are not likely to be due to the limited length of the 
time-series from which the transition probabilities have been calculated. 

Secondly, to test the effect of the Galerkin approximation on the distances, we have again 
reproduced figure using 500,000 realizations, but with a grid of 20 x 20 and 100 x 100 


(Fig. 19 a and b respectively). While for the 20 x 20 grid, the distances get larger, indicating 
that this increase is due to the Galerking approximation, the distances for the 100 x 100 
grid resolution (Fig. 19 b) are very much alike the one for the 50 x 50 resolution (Fig. |^. 
Note that this is also true when a record length of 1, 000, 000 realizations (not shown here), 
so that the similarities between the plots is not likely to be due to a compensating effect 
between the grid resolution and the number of samples per grid point. 

We can thus conclude that the distances represented in hgure for P| and P 2 X 8 are 
mostly due to memory effects introduced by the only partial observation of the barotropic 
model. 
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FIG. 18: As for figure 5, the value at each grid-point represents the distance, for an initial 
density of 1 at this grid-point, between the density transferred by the poser of Pg 
and a density transferred by Pkxs for k = 2. However, the transition matrices have 
been estimated from a time series of (a) 250, 000 days and (b) 1, 000, 000 days, compared 

to the 500, 000 of hgure 
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